BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations

Authors

  • Junbai Wang
  • Kirill Batmanov
Abstract

Sequence variations in regulatory DNA regions are known to cause functionally important consequences for gene expression. DNA sequence variations may have an essential role in determining phenotypes and may be linked to disease; however, their identification through analysis of massive genome-wide sequencing data is a great challenge. In this work, a new computational pipeline, a Bayesian method for protein-DNA interaction with binding affinity ranking (BayesPI-BAR), is proposed for quantifying the effect of sequence variations on protein binding. BayesPI-BAR uses biophysical modeling of protein-DNA interactions to predict single nucleotide polymorphisms (SNPs) that cause significant changes in the binding affinity of a regulatory region for transcription factors (TFs). The method includes two new parameters (TF chemical potentials or protein concentrations and direct TF binding targets) that are neglected by previous methods. The new method is verified on 67 known human regulatory SNPs, of which 47 (70%) have predicted true TFs ranked in the top 10. Importantly, the performance of BayesPI-BAR, which uses principal component analysis to integrate multiple predictions from various TF chemical potentials, is found to be better than that of existing programs, such as sTRAP and is-rSNP, when evaluated on the same SNPs. BayesPI-BAR is a publicly available tool and is able to carry out parallelized computation, which helps to investigate a large number of TFs or SNPs and to detect disease-associated regulatory sequence variations in the sea of genome-wide noncoding regions.
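The idea in the abstract (score a SNP by how much it changes TF occupancy at several chemical potentials, then combine those scores with principal component analysis) can be illustrated with a toy sketch. This is not the BayesPI-BAR implementation. The occupancy form 1 / (1 + exp(E - mu)), the best-window simplification, the use of an occupancy difference as the affinity change, the consensus-based energy matrices, and all names (`occupancy`, `rank_tfs`, `consensus_energy`, `TF_A`, and so on) are assumptions made only for illustration.

```python
import numpy as np

BASES = "ACGT"

def consensus_energy(consensus, penalty=2.0):
    """Hypothetical energy matrix: 0 for the consensus base, `penalty` otherwise."""
    m = np.full((len(consensus), 4), penalty)
    for i, b in enumerate(consensus):
        m[i, BASES.index(b)] = 0.0
    return m

def occupancy(seq, energy, mu):
    """Binding probability 1 / (1 + exp(E - mu)) of the best window in seq."""
    width = energy.shape[0]
    best = 0.0
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        e = sum(energy[i, BASES.index(b)] for i, b in enumerate(window))
        best = max(best, 1.0 / (1.0 + np.exp(e - mu)))
    return best

def rank_tfs(ref, alt, energies, mus):
    """Rank TFs by the first principal component of their occupancy changes."""
    names = list(energies)
    # Rows: TFs. Columns: one occupancy change (alt minus ref) per chemical potential.
    shifts = np.array([
        [occupancy(alt, energies[n], mu) - occupancy(ref, energies[n], mu)
         for mu in mus]
        for n in names
    ])
    # PCA via SVD of the column-centered matrix; vt[0] is the first component.
    centered = shifts - shifts.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[0]
    # The sign of a principal component is arbitrary, so rank by magnitude.
    order = np.argsort(-np.abs(scores))
    return [names[i] for i in order]

# Toy data: the SNP (G to C) breaks the ACGT site preferred by TF_A only.
energies = {
    "TF_A": consensus_energy("ACGT"),
    "TF_B": consensus_energy("TTTT"),
    "TF_C": consensus_energy("AAAA"),
}
ranking = rank_tfs("CCACGTCC", "CCACCTCC", energies, mus=(-2.0, 0.0, 2.0))
print(ranking[0])
```

In this toy case only the hypothetical `TF_A` loses its preferred site, so it receives the largest first-component score. Evaluating several values of `mu` mirrors the abstract's point that predictions from multiple chemical potentials are integrated instead of relying on a single assumed protein concentration.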




Journal title:

Volume 43, Issue

Pages -

Publication date: 2015